To see our GitHub repository, click here.
To see our Shiny application, click here.
sessionInfo(package=NULL)
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readr_1.1.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.10 digest_0.6.11 rprojroot_1.2 R6_2.2.0 jsonlite_1.4 backports_1.0.5 magrittr_1.5
[8] evaluate_0.10 stringi_1.1.2 rmarkdown_1.3 tools_3.3.2 stringr_1.1.0 hms_0.3 yaml_2.1.14
[15] base64enc_0.1-3 htmltools_0.3.5 knitr_1.15.1 tibble_1.3.0
This data set is from Kaggle. You can find this data here. It provides detailed information about libraries in each state in the United States.
summary(states)
Submission_Year State State_Code Region_Code Service_Population Service_Population_Without_Duplicates
Min. :2015 Length:51 Min. : 1.00 Min. :1.000 Min. : 582658 Min. : 582658
1st Qu.:2015 Class :character 1st Qu.:16.50 1st Qu.:3.000 1st Qu.: 1493456 1st Qu.: 1472869
Median :2015 Mode :character Median :29.00 Median :5.000 Median : 4395295 Median : 4395295
Mean :2015 Mean :28.96 Mean :4.471 Mean : 6181043 Mean : 6021934
3rd Qu.:2015 3rd Qu.:41.50 3rd Qu.:6.000 3rd Qu.: 7475148 3rd Qu.: 6764752
Max. :2015 Max. :56.00 Max. :8.000 Max. :38322887 Max. :38322887
State_Population Central_Libraries Branch_Libraries Bookmobiles MLS_Librarians Librarians Employees
Min. : 582658 Min. : 1.0 Min. : 3.0 Min. : 0.00 Min. : 39.88 Min. : 112.5 Min. : 111.0
1st Qu.: 1743730 1st Qu.: 63.0 1st Qu.: 38.5 1st Qu.: 3.00 1st Qu.: 144.50 1st Qu.: 284.2 1st Qu.: 376.4
Median : 4395295 Median :112.0 Median : 95.0 Median : 7.00 Median : 329.58 Median : 679.9 Median :1210.1
Mean : 6195242 Mean :177.5 Mean :150.4 Mean :12.92 Mean : 629.70 Mean : 927.1 Mean :1788.9
3rd Qu.: 6817706 3rd Qu.:237.5 3rd Qu.:207.0 3rd Qu.:17.50 3rd Qu.: 767.76 3rd Qu.:1171.0 3rd Qu.:2340.9
Max. :38340074 Max. :755.0 Max. :950.0 Max. :75.00 Max. :3437.24 Max. :4073.9 Max. :8391.0
Total_Staff Local_Government_Operating_Revenue State_Government_Operating_Revenue Federal_Government_Operating_Revenue
Min. : 245.3 Min. :0.000e+00 Min. : 0 Min. : 0
1st Qu.: 723.2 1st Qu.:4.424e+07 1st Qu.: 1205048 1st Qu.: 247294
Median : 1817.3 Median :1.185e+08 Median : 4033130 Median : 497265
Mean : 2716.0 Mean :2.016e+08 Mean : 16668047 Mean : 895589
3rd Qu.: 3519.8 3rd Qu.:2.374e+08 3rd Qu.: 12342780 3rd Qu.:1070844
Max. :12464.9 Max. :1.265e+09 Max. :345037953 Max. :5792339
Other_Operating_Revenue Total_Operating_Revenue Salaries Benefits Total_Staff_Expenditures
Min. : 180898 Min. :1.830e+07 Min. : 8517104 Min. : 229606 Min. : 10890655
1st Qu.: 4018902 1st Qu.:5.172e+07 1st Qu.: 24806427 1st Qu.: 8439042 1st Qu.: 33433702
Median : 9247116 Median :1.306e+08 Median : 62750473 Median : 22988790 Median : 85739263
Mean : 17522202 Mean :2.367e+08 Mean :109245728 Mean : 39733389 Mean :148979117
3rd Qu.: 17584116 3rd Qu.:2.848e+08 3rd Qu.:139643819 3rd Qu.: 45490279 3rd Qu.:190242298
Max. :191656418 Max. :1.355e+09 Max. :605134988 Max. :281367947 Max. :876111285
Print_Collection_Expenditures Digital_Collection_Expenditures Other_Collection_Expenditures Total_Collection_Expenditures
Min. : 1732866 Min. : 158409 Min. : 76731 Min. : 2432733
1st Qu.: 3703121 1st Qu.: 1161444 1st Qu.: 756074 1st Qu.: 5457850
Median :10136582 Median : 2887735 Median : 3217904 Median : 15558992
Mean :14738013 Mean : 5356909 Mean : 4911608 Mean : 25006531
3rd Qu.:18743538 3rd Qu.: 6431658 3rd Qu.: 6629340 3rd Qu.: 34862064
Max. :68898041 Max. :24944727 Max. :32163245 Max. :112729077
Other_Operating_Expenditures Total_Operating_Expenditures Local_Government_Capital_Revenue State_Government_Capital_Revenue
Min. : 3944576 Min. :1.731e+07 Min. : 0 Min. : 0
1st Qu.: 9923100 1st Qu.:4.882e+07 1st Qu.: 1998838 1st Qu.: 0
Median : 26913847 Median :1.260e+08 Median : 7588520 Median : 49290
Mean : 48396075 Mean :2.224e+08 Mean :10798721 Mean : 2419471
3rd Qu.: 56365576 3rd Qu.:2.680e+08 3rd Qu.:16868779 3rd Qu.: 1125000
Max. :326843919 Max. :1.286e+09 Max. :50568512 Max. :25243857
Federal_Government_Capital_Revenue Other_Capital_Revenue Total_Capital_Revenue Total_Capital_Expenditures Print_Collection
Min. : 0 Min. : 0 Min. : 190956 Min. : 347748 Min. : 1642715
1st Qu.: 0 1st Qu.: 310920 1st Qu.: 4831600 1st Qu.: 5074609 1st Qu.: 4679018
Median : 31980 Median : 1289455 Median : 9698454 Median : 10372327 Median : 9442086
Mean : 495817 Mean : 4071006 Mean : 17785016 Mean : 22637427 Mean :15056932
3rd Qu.: 211338 3rd Qu.: 3236141 3rd Qu.: 23468550 3rd Qu.: 28077506 3rd Qu.:17167508
Max. :18521642 Max. :87067000 Max. :116604806 Max. :126989107 Max. :70815265
Digital_Collection Audio_Collection Downloadable_Audio Physical_Video Downloadable_Video Local_Cooperative_Agreements
Min. : 22966 Min. : 86416 Min. : 9338 Min. : 119605 Min. : 0 Min. : 0
1st Qu.: 399830 1st Qu.: 230386 1st Qu.: 222568 1st Qu.: 324603 1st Qu.: 5026 1st Qu.: 395
Median : 1647955 Median : 593488 Median : 564568 Median : 814043 Median : 21580 Median : 888
Mean : 4194353 Mean : 893410 Mean : 1300921 Mean :1209907 Mean :105070 Mean : 1909
3rd Qu.: 3683356 3rd Qu.:1130140 3rd Qu.: 1358018 3rd Qu.:1437894 3rd Qu.: 71680 3rd Qu.: 1865
Max. :47106859 Max. :3530931 Max. :11064932 Max. :5154677 Max. :922536 Max. :17442
State_Licensed_Databases Total_Licensed_Databases Print_Subscriptions Hours_Open Library_Visits
Min. : 0 Min. : 74 Min. : 2071 Min. : 65208 Min. : 2186038
1st Qu.: 1785 1st Qu.: 2810 1st Qu.: 6368 1st Qu.: 293115 1st Qu.: 7430782
Median : 4275 Median : 5003 Median : 14972 Median : 551274 Median : 18178677
Mean : 6252 Mean : 8161 Mean : 26915 Mean : 721113 Mean : 27947560
3rd Qu.: 8093 3rd Qu.:11726 3rd Qu.: 36913 3rd Qu.: 922803 3rd Qu.: 35273654
Max. :26371 Max. :28951 Max. :166683 Max. :2392554 Max. :164300175
Reference_Transactions Registered_Users Circulation_Transactions Interlibrary_Loans_Provided Interlibrary_Loans_Received
Min. : 373837 Min. : 281392 Min. : 3938767 Min. : 0 Min. : 57
1st Qu.: 911614 1st Qu.: 924534 1st Qu.: 9299584 1st Qu.: 57346 1st Qu.: 62142
Median : 3483395 Median : 2476596 Median : 27866711 Median : 266163 Median : 302038
Mean : 5157088 Mean : 3372037 Mean : 45376494 Mean : 1385314 Mean : 1382256
3rd Qu.: 6240569 3rd Qu.: 4122706 3rd Qu.: 61758828 3rd Qu.: 906895 3rd Qu.: 901136
Max. :28105028 Max. :21723648 Max. :222788583 Max. :11912575 Max. :11469527
Library_Programs Childrens_Programs Young_Adult_Programs Library_Program_Audience Childrens_Program_Audience
Min. : 7940 Min. : 5265 Min. : 1172 Min. : 183507 Min. : 145049
1st Qu.: 25835 1st Qu.: 15432 1st Qu.: 2356 1st Qu.: 609558 1st Qu.: 388609
Median : 62120 Median : 37852 Median : 6168 Median :1396310 Median :1032396
Mean : 87964 Mean : 50464 Mean : 8322 Mean :2000421 Mean :1376584
3rd Qu.:101758 3rd Qu.: 64567 3rd Qu.: 9664 3rd Qu.:2450672 3rd Qu.:1725636
Max. :500005 Max. :213468 Max. :55526 Max. :9491467 Max. :6909344
Young_Adult_Program_Audience Public_Internet_Computers Internet_Computer_Use Wireless_Internet_Sessions Start_Date
Min. : 12589 Min. : 524 Min. : 577405 Min. : -1 Length:51
1st Qu.: 33708 1st Qu.: 1556 1st Qu.: 1895499 1st Qu.: 198808 Class :character
Median : 94146 Median : 4726 Median : 4465464 Median : 1266660 Mode :character
Mean :131304 Mean : 5610 Mean : 6320727 Mean : 2926421
3rd Qu.:160931 3rd Qu.: 6724 3rd Qu.: 7600968 3rd Qu.: 4223684
Max. :725295 Max. :21735 Max. :35000501 Max. :15224387
End_Date
Length:51
Class :character
Mode :character
This data is a subset of the library data, reshaped so that Digital Cost, Other Cost, Print Cost, and Total Cost become values of a single category column, Collection Cost Type. This way, the four costs can be grouped on the same visualization.
summary(states_boxplot)
State Category Cost
Length:204 Length:204 Min. : 76731
Class :character Class :character 1st Qu.: 2429076
Mode :character Mode :character Median : 5843598
Mean : 12503265
3rd Qu.: 15685597
Max. :112729077
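The reshaping described above can be sketched in base R. This is a minimal illustration, not the code used for the project: the two-state data frame and its values are hypothetical, and using the original column names as category labels is an assumption. With 51 rows and four cost columns, the same approach would yield the 204 rows shown in the summary.

```r
# Hypothetical two-state excerpt; the figures are made up for illustration.
states <- data.frame(
  State = c("A", "B"),
  Print_Collection_Expenditures   = c(100, 200),
  Digital_Collection_Expenditures = c(30, 60),
  Other_Collection_Expenditures   = c(20, 40),
  Total_Collection_Expenditures   = c(150, 300),
  stringsAsFactors = FALSE
)

cost_cols <- c("Print_Collection_Expenditures",
               "Digital_Collection_Expenditures",
               "Other_Collection_Expenditures",
               "Total_Collection_Expenditures")

# Stack the four cost columns into one long table:
# one row per state and cost category.
states_boxplot <- do.call(rbind, lapply(cost_cols, function(col) {
  data.frame(State = states$State,
             Category = col,   # assumed labels: the original column names
             Cost = states[[col]],
             stringsAsFactors = FALSE)
}))

print(nrow(states_boxplot))  # number of states times four categories
```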
This data is a subset of the library data, reshaped so that Children’s Programs, Young Adult Programs, and Adult Programs become values of a single category column, Program Type. This way, the three program counts can be grouped on the same visualization.
summary(Program_Category)
State Program_Category Num_Programs
Length:153 Length:153 Min. : 1172
Class :character Class :character 1st Qu.: 7381
Mode :character Mode :character Median : 22070
Mean : 48917
3rd Qu.: 62120
Max. :500005
This data set is from the census data on data.world. You can find this data here. It provides the number of employed persons in each state.
summary(census_employment)
State Employed
Length:52 Min. : 456640
Class :character 1st Qu.: 1398759
Mode :character Median : 3335384
Mean : 4886644
3rd Qu.: 5501786
Max. :30312429
This data set is from the census data on data.world. You can find this data here. It provides the number of people enrolled in high school in each state.
summary(census_enrollment)
State Enrollment_9to12
Length:52 Min. : 24198
Class :character 1st Qu.: 91860
Mode :character Median : 215712
Mean : 331031
3rd Qu.: 362992
Max. :2216175
This data set is from the census data on data.world. You can find this data here. It provides the median family income in each state.
summary(Median_Family_Income)
State B19119_001
Length:52 Min. :22976
Class :character 1st Qu.:57986
Mode :character Median :65813
Mean :66551
3rd Qu.:74030
Max. :90089
This data set is from Current Results. You can find this data here. It provides the average temperature, total hours of sunlight, and number of clear days in each state.
summary(State_Temp_and_Rain)
State Average Temperature Total Hours of Sunlight Clear Days
Length:50 Min. :26.60 Min. :2061 Min. : 58.00
Class :character 1st Qu.:45.25 1st Qu.:2514 1st Qu.: 89.25
Mode :character Median :51.20 Median :2690 Median :100.50
Mean :51.94 Mean :2721 Mean :103.26
3rd Qu.:58.65 3rd Qu.:2924 3rd Qu.:115.00
Max. :70.70 Max. :3806 Max. :193.00
NA's :3
This data set is from Researcher Tools. You can find this data here. It provides useful connections between state names, state codes, and regions.
summary(states_with_regions)
State State Code Region Sub-Region
Length:50 Length:50 Length:50 Length:50
Class :character Class :character Class :character Class :character
Mode :character Mode :character Mode :character Mode :character
Map of the United States color-coded by region. This is an interactive map that allows you to filter what is shown on the sheets Library Programs vs. Visits and Program Breakdown by Type.
Dual combination chart showing the number of library programs per 10,000 people and the annual visits per capita for each state. Two calculated measures were created for this visualization.
Additionally, a set called High Visit States was created, grouping the states that average more than 5.5 visits per person per year.
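As a rough R equivalent of the two calculated measures and the set, the sketch below works on a hypothetical three-state data frame. The visits-per-capita formula follows the definition given later in this report; the programs-per-10,000 formula (programs divided by state population, times 10,000) is an assumption, since the exact Tableau calculation is not shown.

```r
# Hypothetical values for three states; not taken from the data set.
states <- data.frame(
  State = c("A", "B", "C"),
  Library_Programs = c(50000, 20000, 90000),
  Library_Visits   = c(30000000, 8000000, 66000000),
  State_Population = c(5000000, 2000000, 10000000),
  stringsAsFactors = FALSE
)

# Assumed definition: programs scaled to a population of 10,000.
states$Programs_per_10k <-
  states$Library_Programs / states$State_Population * 10000

# Visits per capita: [Library_Visits] / [State_Population].
states$Visits_per_Capita <- states$Library_Visits / states$State_Population

# Set membership: more than 5.5 visits per person per year.
states$High_Visit_State <- states$Visits_per_Capita > 5.5

print(states[, c("State", "Programs_per_10k",
                 "Visits_per_Capita", "High_Visit_State")])
```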
The South region has zero states in the High Visit States set. The other three regions have either 4 or 5 states each. This implies that libraries in the South could benefit from talking to the other regions about the kinds of programs they are implementing to get people to go to the library. Additionally, the states in the South region have noticeably fewer library programs per 10,000 residents. Looking at all of the regions, there is a general positive correlation between the number of programs per 10,000 people and the average number of visits per person. Therefore, if a state wanted to increase the number of visits to public libraries, a larger budget could be allocated for library programs.
A stacked bar graph showing the percentage breakdown of total library programs into those for children, those for adolescents, and those for adults. To determine this, three measure calculations were used; however, these were not stored as new measures. These calculations changed the data from number of programs to a percentage of total programs for each of the three program categories.
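The percent-of-total conversion can be illustrated in R as follows. The data are hypothetical, and treating the third category as the remainder of Library_Programs after children's and young adult programs is an assumption; the report does not show how that category was derived.

```r
# Hypothetical counts for two states; not taken from the data set.
programs <- data.frame(
  State = c("A", "B"),
  Library_Programs     = c(1000, 400),
  Childrens_Programs   = c(600, 200),
  Young_Adult_Programs = c(100, 50),
  stringsAsFactors = FALSE
)

# Assumption: the remaining programs form the third category.
programs$Other_Programs <- programs$Library_Programs -
  programs$Childrens_Programs - programs$Young_Adult_Programs

counts <- as.matrix(programs[, c("Childrens_Programs",
                                 "Young_Adult_Programs",
                                 "Other_Programs")])

# Divide each row by that state's total to get percentages.
pct <- round(100 * counts / programs$Library_Programs, 1)
rownames(pct) <- programs$State

print(pct)
```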
In general, if less than 58% of a state's library programs are aimed at children, then the annual number of visits per capita is lower. Additionally, southern states (on average) offer a lower percentage of children's programs than the other three regions. This could be a contributing factor to the lower number of visits to libraries in the South.
This is a dashboard for the previous three visualizations.
Box and whisker plot showing the total digital collection cost, print collection cost, other collection cost, and total collection cost for each state. The visualization has four pages, one for each region.
All four regions spend the most on print collections with digital and other (such as audio) collection costs being roughly similar. Heavily populated states (Texas, California, New York, Ohio) fall outside the interquartile range for collection costs due to the significantly increased volume of collection material required in these larger states.
Scatterplot showing the relationship between the total collection cost and size for each state color-coded by region with a trendline for each region.
From the regional trend lines on this visualization, it can be seen that Midwest and Northeast spend less per collection item than South and West regions. The South and West regions should consult with the material procurement teams in the Midwest and Northeast regions to determine how to reduce collection costs.
This is a dashboard for the previous two visualizations.
A scatter plot showing how the average annual temperature in a state is related to the average annual visits per capita. The average annual visits per capita was found using the following formula: [Library_Visits]/[State_Population]. The points are color-coded by region and have a trend line with 95% confidence bands.
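One common way to reproduce a linear trend line with 95% confidence bands in R is `lm()` followed by `predict()` with `interval = "confidence"`. The sketch below uses hypothetical points and a single fit for all regions; whether it matches Tableau's trend-line model exactly is an assumption.

```r
# Hypothetical points; not the report's data.
d <- data.frame(
  Avg_Temp          = c(40, 45, 50, 55, 60, 65, 70),
  Visits_per_Capita = c(6.1, 5.8, 5.0, 4.9, 4.2, 3.9, 3.5)
)

# Ordinary least-squares fit of visits per capita on temperature.
fit <- lm(Visits_per_Capita ~ Avg_Temp, data = d)

# Fitted values with a 95% confidence band for the mean response.
new_x <- data.frame(Avg_Temp = seq(40, 70, by = 5))
band <- predict(fit, newdata = new_x, interval = "confidence", level = 0.95)

# Columns: fit (trend line), lwr and upr (band limits).
print(cbind(new_x, band))
```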
There is a general trend that the higher the average temperature in a state, the fewer visits per capita to its libraries. This is a logical, yet interesting result, as warmer temperatures lend themselves more towards being active and outdoors than attending the library.
A crosstab of operational revenue to cost ratio. A red-green scale was used to color the average ratio value of Profit to Expense Ratio with a ratio of 1.0 selected as the transition point between the colors. The crosstab is created with values given by state and separated by sub-region.
Profit to Expense Ratio = [Total_Operating_Revenue]/[Total_Operating_Expenditures]
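The same ratio, averaged by sub-region as in the crosstab, could be computed in R along these lines. The states, sub-regions, and dollar amounts are hypothetical, and the column name Sub_Region is an assumed stand-in for the Sub-Region field.

```r
# Hypothetical figures; not taken from the data set.
lib <- data.frame(
  State      = c("A", "B", "C"),
  Sub_Region = c("X", "X", "Y"),   # assumed name for the Sub-Region field
  Total_Operating_Revenue      = c(110, 95, 120),
  Total_Operating_Expenditures = c(100, 100, 100),
  stringsAsFactors = FALSE
)

# Ratio as defined above: revenue divided by expenditures.
lib$Ratio <- lib$Total_Operating_Revenue / lib$Total_Operating_Expenditures

# A ratio above 1.0 means revenue exceeds expenditures
# (the color transition point in the crosstab).
lib$Above_Break_Even <- lib$Ratio > 1.0

# Average ratio per sub-region.
avg_by_sub_region <- aggregate(Ratio ~ Sub_Region, data = lib, FUN = mean)
print(avg_by_sub_region)
```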
New England has the lowest revenue-to-cost ratio. Most of the region’s libraries are profiting (a ratio above 1.0), but no library is particularly profitable.
A filled map of the United States showing the average number of computers per library for each state. A KPI was created for this statistic, and each state is filled according to whether its KPI is low, medium, or high. The range for each category of the KPI can be adjusted by the user. KPI Low was set to 12 and KPI Medium was set to 19.
KPI Computers per Library =
IF AVG([Public_Internet_Computers] / ([Central_Libraries] + [Branch_Libraries])) <= [KPI Low] THEN "Low"
ELSEIF AVG([Public_Internet_Computers] / ([Central_Libraries] + [Branch_Libraries])) <= [KPI Medium] THEN "Medium"
ELSE "High"
END
KPI Low: ranges from 1 to 12
KPI Medium: ranges from 13 to 20
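For readers more familiar with R than Tableau, the KPI logic above can be sketched as a small function. The function name and the example inputs are hypothetical; the defaults mirror the thresholds of 12 and 19 stated in the text, and the sketch assumes one row per state, so that Tableau's AVG reduces to the state's own ratio.

```r
# Hypothetical helper mirroring the Tableau calculation.
kpi_computers_per_library <- function(computers, central, branch,
                                      kpi_low = 12, kpi_medium = 19) {
  # Computers per library outlet (central plus branch libraries).
  per_library <- computers / (central + branch)
  ifelse(per_library <= kpi_low, "Low",
         ifelse(per_library <= kpi_medium, "Medium", "High"))
}

# Made-up inputs giving 10, 15, and 25 computers per library.
kpi <- kpi_computers_per_library(
  computers = c(1000, 3000, 5000),
  central   = c(50, 100, 100),
  branch    = c(50, 100, 100)
)
print(kpi)
```

Keeping the thresholds as arguments corresponds to the user-adjustable parameters in the dashboard.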
The Midwest has the fewest computers per library compared to other regions. The South has the most computers per library.
A histogram showing annual computer usage, with a bin size of 2.5 million. Each column is labelled with the states that fall into that bin. Annual computer usage is measured in number of sessions logged. The KPI Computers per Library was used again in this case, with KPI Low set to 12 and KPI Medium set to 19.
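The binning behind such a histogram can be sketched in R by assigning each state to the lower edge of its 2.5 million wide bin. The usage figures are hypothetical, and the assumption is that bins start at zero and include their lower edge.

```r
# Hypothetical annual session counts for four states.
usage <- c(A = 600000, B = 2400000, C = 2600000, D = 7600000)

bin_width <- 2.5e6

# Lower edge of the bin each state falls into.
bin_start <- floor(usage / bin_width) * bin_width

# Group state names by bin, giving the labels for each column.
states_by_bin <- split(names(usage), bin_start)
print(states_by_bin)
```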
States that have medium to high numbers of computers per library (blue and red blocks) also tend to have higher computer usage. This could indicate that the extra computers available to the public allow for more computer usage.
This is a dashboard for the previous two visualizations.
This boxplot shows the minimum, maximum, first quartile, third quartile, and median of the “Cost” values for each expenditure, including Digital Collection Expenditures, Print Collection Expenditures, Other Expenditures, and Total Expenditures. The user may select the “Cost Range” that they would like to see.
This is interesting because you can see how many states are outliers in spending on Digital and Print Expenditures.
This histogram shows the number of Librarians in each state.
This is interesting because you can see the overall trend of many states having relatively few librarians (>300), while one state, New York, has almost a thousand more librarians than the second-highest state, CA, which creates a gap in the histogram.
This graph shows how different states compare in terms of Library Visits per Median Family Income. This uses a join with the 2015 Census data.
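A join of this kind can be illustrated in R with `merge()`. The rows below are hypothetical; the column name B19119_001 is taken from the Median_Family_Income summary above, and dividing visits by median income is an assumed reading of "Library Visits per Median Family Income". Note that the census tables have 52 rows while the library table has 51, so an inner join keeps only the states present in both.

```r
# Hypothetical excerpts of the two tables.
library_visits <- data.frame(
  State = c("A", "B"),
  Library_Visits = c(30000000, 8000000),
  stringsAsFactors = FALSE
)
median_family_income <- data.frame(
  State = c("A", "B", "C"),
  B19119_001 = c(75000, 50000, 60000),
  stringsAsFactors = FALSE
)

# Inner join on State: unmatched rows (here, "C") are dropped.
joined <- merge(library_visits, median_family_income, by = "State")

# Assumed measure: visits divided by median family income.
joined$Visits_per_Income <- joined$Library_Visits / joined$B19119_001
print(joined)
```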
This is interesting because it shows a trend that richer states have more library visits. We were surprised by this because libraries provide many services for free that would benefit low-income families.
This graph shows the Cost per Category per State. The graph is colored according to the table calculation sum(Library_Visits) / sum(Service_Population_Without_Duplicates). The limits for the KPI can be selected by the user.
This is interesting because you can see that Texas has some of the highest library costs in the country, but still has a relatively low number of visitors per service population.
This graph shows the number of Librarians per State, and the fill of the graph is colored according to the table calculation of the number of citizens per Librarian.
This is interesting because you can see that GA has a relatively low number of overall librarians, but has by far the highest ratio of citizens per librarian.
This graph compares the total open library hours and the high school enrollment in each state.
This is interesting because, generally speaking, the states with the highest hours open have the highest high school enrollment.